Yellowknife
Personalized Reasoning: Just-In-Time Personalization and Why LLMs Fail At It
Li, Shuyue Stella, Bose, Avinandan, Brahman, Faeze, Du, Simon Shaolei, Koh, Pang Wei, Fazel, Maryam, Tsvetkov, Yulia
Current large language model (LLM) development treats task-solving and preference alignment as separate challenges, optimizing first for objective correctness, then for alignment to aggregated human preferences. This paradigm fails in human-facing applications where solving a problem correctly is insufficient if the response mismatches the user's needs. This challenge intensifies in just-in-time scenarios where no prior user interaction history exists due to cold-start conditions or privacy constraints. LLMs need to identify what they don't know about user preferences, strategically elicit preference values through questioning, then adapt their reasoning processes and responses accordingly -- a complicated chain of cognitive processes which we term personalized reasoning. We introduce PREFDISCO, an evaluation methodology that transforms static benchmarks into interactive personalization tasks using psychologically-grounded personas with sparse preferences. Our framework creates scenarios where identical questions require different reasoning chains depending on user context, as optimal explanation approaches vary by individual expertise and preferences while maintaining factual accuracy. Evaluation of 21 frontier models across 10 tasks reveals 29.0% of naive personalization attempts produce worse preference alignment than generic responses, yet generic responses also fail to serve individual user needs effectively. These findings suggest personalized reasoning requires dedicated development rather than emerging naturally. PREFDISCO establishes personalized reasoning as a measurable research frontier and reveals fundamental limitations in current LLMs' interactive capabilities, providing a foundation for developing systems that can adapt to individual users in education, healthcare, and technical domains where personalization is critical.
- Asia > Middle East > Jordan (0.05)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- (6 more...)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Education (1.00)
- Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.93)
- Health & Medicine > Diagnostic Medicine (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Scaling Down Semantic Leakage: Investigating Associative Bias in Smaller Language Models
Semantic leakage is a phenomenon recently introduced by Gonen et al. (2024). It refers to a situation in which associations learnt from the training data emerge in language model generations in an unexpected and sometimes undesired way. Prior work has focused on leakage in large language models (7B+ parameters). In this study, I use the Qwen2.5 model family to explore whether smaller models, ranging from 500M to 7B parameters, demonstrate less semantic leakage due to their limited capacity for capturing complex associations. Building on the previous dataset from Gonen et al. (2024), I introduce a new dataset of color-focused prompts, categorized into specific types of semantic associations, to systematically evaluate the models' performance. Results indicate that smaller models exhibit less semantic leakage overall, although this trend is not strictly linear, with medium-sized models sometimes surpassing larger ones in leaking behavior. The dataset, the model generations, and the evaluation code are publicly available at https://github.com/smilni/semantic_leakage_project.
- North America > United States > West Virginia (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > Texas > Orange County > Orange (0.04)
- (11 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data
Timonen, Juho, Lähdesmäki, Harri
Gaussian process (GP) models that combine both categorical and continuous input variables have found use in longitudinal data analysis and computer experiments. However, standard inference for these models has the typical cubic scaling, and common scalable approximation schemes for GPs cannot be applied since the covariance function is non-continuous. In this work, we derive a basis function approximation scheme for mixed-domain covariance functions, which scales linearly with respect to the number of observations and total number of basis functions. The proposed approach is also naturally applicable to Bayesian GP regression with discrete observation models. We demonstrate the scalability of the approach and compare model reduction techniques for additive GP models in a longitudinal data context. We confirm that we can approximate the exact GP model accurately in a fraction of the runtime compared to fitting the corresponding exact model. In addition, we demonstrate a scalable model reduction workflow for obtaining smaller and more interpretable models when dealing with a large number of candidate predictors.
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (12 more...)
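To make the idea of a mixed-domain covariance function concrete, here is a minimal sketch, assuming a product of a squared-exponential kernel over a continuous input (for example, time) and an indicator kernel over a categorical input (for example, a group label). This is one common construction, not necessarily the kernel used in the paper, and the function names `rbf`, `categorical`, and `mixed_kernel` are hypothetical. It shows the exact, unapproximated covariance only, not the basis function approximation the abstract describes.

```python
import numpy as np

def rbf(t1, t2, lengthscale=1.0):
    """Squared-exponential kernel over a continuous 1-D input."""
    diff = t1[:, None] - t2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def categorical(z1, z2):
    """Indicator kernel: 1 where category labels match, 0 otherwise."""
    return (z1[:, None] == z2[None, :]).astype(float)

def mixed_kernel(t1, z1, t2, z2, lengthscale=1.0):
    """Assumed mixed-domain kernel: elementwise product of the two kernels."""
    return rbf(t1, t2, lengthscale) * categorical(z1, z2)

# Six observations: three time points in each of two groups.
t = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
z = np.array([0, 0, 0, 1, 1, 1])

K = mixed_kernel(t, z, t, z)
print(np.round(K, 3))
```

In this construction, observations from different groups have zero covariance no matter how close they are in time, so the covariance jumps as the categorical input changes. That jump is the kind of discontinuity the abstract refers to.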
Sparse Variational Contaminated Noise Gaussian Process Regression with Applications in Geomagnetic Perturbations Forecasting
Iong, Daniel, McAnear, Matthew, Qu, Yuezhou, Zou, Shasha, Toth, Gabor, Chen, Yang
GPR models can also incorporate prior knowledge through selecting an appropriate kernel function. GPR commonly assumes a homoscedastic Gaussian distribution for observation noise because this yields an analytical form for the posterior predictive distribution. However, Bayesian inference based on Gaussian noise distributions is known to be sensitive to outliers, which are defined as observations that strongly deviate from model assumptions. In regression, outliers can arise from relevant inputs being absent from the model, measurement error, and other unknown sources. These outliers are associated with unconsidered sources of variation that affect the target variable sporadically. In this case, the observation model is unable to distinguish between random noise and systematic effects not captured by the model. In the context of GPR under Gaussian noise, outliers can heavily influence the posterior predictive distribution, resulting in a biased estimate of the mean function and overly confident prediction intervals. Therefore, robust observation models are desired in the presence of potential outliers.
- North America > United States > Michigan (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > Jordan (0.04)
- (5 more...)
- Energy (0.67)
- Transportation (0.48)
- Consumer Products & Services > Travel (0.47)
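The outlier sensitivity described in the abstract above can be illustrated with a small experiment. The following is a minimal sketch of exact GP regression under homoscedastic Gaussian noise, assuming a squared-exponential kernel with fixed, hand-picked hyperparameters and synthetic data. It is not the paper's sparse variational contaminated-noise model, and the helper names are hypothetical. It only shows how one corrupted observation pulls the posterior mean.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior_mean(x_train, y_train, x_test, noise_var=0.01):
    """Exact GP posterior mean under homoscedastic Gaussian noise."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train)
    return K_star @ np.linalg.solve(K, y_train)

# Synthetic data (assumption): a noiseless sine curve on a regular grid.
x = np.linspace(0.0, 5.0, 21)
y_clean = np.sin(x)

# Corrupt a single observation at x = 2.5 to act as an outlier.
y_outlier = y_clean.copy()
y_outlier[10] += 5.0

x_test = np.array([2.5])
mean_clean = gp_posterior_mean(x, y_clean, x_test)[0]
mean_outlier = gp_posterior_mean(x, y_outlier, x_test)[0]
shift = mean_outlier - mean_clean

print(f"clean mean: {mean_clean:.3f}, mean with outlier: {mean_outlier:.3f}")
```

Because the Gaussian-noise posterior mean is linear in the observations, the shift grows in proportion to the size of the corruption. That unbounded influence is what a robust observation model is meant to limit.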
Locomotion as Manipulation with ReachBot
Chen, Tony G., Newdick, Stephanie, Di, Julia, Bosio, Carlo, Ongole, Nitin, Lapotre, Mathieu, Pavone, Marco, Cutkosky, Mark R.
Caves and lava tubes on the Moon and Mars are sites of geological and astrobiological interest but consist of terrain that is inaccessible with traditional robot locomotion. To support the exploration of these sites, we present ReachBot, a robot that uses extendable booms as appendages to manipulate itself with respect to irregular rock surfaces. The booms terminate in grippers equipped with microspines and provide ReachBot with a large workspace, allowing it to achieve force closure in enclosed spaces such as the walls of a lava tube. To propel ReachBot, we present a contact-before-motion planner for non-gaited legged locomotion that utilizes internal force control, similar to a multi-fingered hand, to keep its long, slender booms in tension. Motion planning also depends on finding and executing secure grips on rock features. We use a Monte Carlo simulation to inform gripper design and predict grasp strength and variability. Additionally, we use a two-step perception system to identify possible grasp locations. To validate our approach and mechanisms under realistic conditions, we deployed a single ReachBot arm and gripper in a lava tube in the Mojave Desert. The field test confirmed that ReachBot will find many targets for secure grasps with the proposed kinematic design.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Panama (0.04)
- North America > Dominican Republic > Azua > Azua (0.04)
- (5 more...)
- Government > Regional Government > North America Government > United States Government (1.00)
- Energy (1.00)
- Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
- Information Technology > Artificial Intelligence > Robots > Manipulation (0.88)
LLM Processes: Numerical Predictive Distributions Conditioned on Natural Language
Requeima, James, Bronskill, John, Choi, Dami, Turner, Richard E., Duvenaud, David
Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses. Moreover, the expertise needed to integrate this prior knowledge into probabilistic modeling typically limits the application of these models to specialists. Our goal is to build a regression model that can process numerical data and make probabilistic predictions at arbitrary locations, guided by natural language text which describes a user's prior knowledge. Large Language Models (LLMs) provide a useful starting point for designing such a tool since they 1) provide an interface where users can incorporate expert insights in natural language and 2) provide an opportunity for leveraging latent problem-relevant knowledge encoded in LLMs that users may not have themselves. We start by exploring strategies for eliciting explicit, coherent numerical predictive distributions from LLMs. We examine these joint predictive distributions, which we call LLM Processes, over arbitrarily-many quantities in settings such as forecasting, multi-dimensional regression, black-box optimization, and image modeling. We investigate the practical details of prompting to elicit coherent predictive distributions, and demonstrate their effectiveness at regression. Finally, we demonstrate the ability to usefully incorporate text into numerical predictions, improving predictive performance and giving quantitative structure that reflects qualitative descriptions. This lets us begin to explore the rich, grounded hypothesis space that LLMs implicitly encode.
- North America > Canada > Ontario > Toronto (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- (13 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing
Wakayama, Shohei, Ahmed, Nisar
Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving semantic data from humans, as well as account for possible discrepancies between human-provided data and these beliefs. To this end, this paper introduces the problem of semantic data association (SDA) in relation to conventional data association problems for sensor fusion. It then develops a novel probabilistic semantic data association (PSDA) algorithm to rigorously address SDA in general settings, unlike previous work on semantic data fusion which developed heuristic techniques for specific settings. PSDA is further incorporated into a recursive hybrid Bayesian data fusion scheme which uses Gaussian mixture priors for object states and softmax functions for semantic human sensor data likelihoods. Simulations of a multi-object search task show that PSDA enables robust collaborative state estimation under a wide range of conditions where semantic human sensor data can be erroneous or contain significant reference ambiguities.
- North America > United States > Colorado > Boulder County > Boulder (0.14)
- Asia > Japan > Honshū > Kansai > Wakayama Prefecture > Wakayama (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
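The fusion step mentioned in the abstract above, a Gaussian mixture prior combined with a softmax likelihood for a semantic human observation, can be sketched on a 1-D grid. This is a minimal illustration under stated assumptions: the state is a single object's position on a line, the three labels and their softmax weights are hypothetical, and the posterior is computed by brute-force grid normalization rather than by the paper's hybrid Bayesian scheme. It does not model the data association step, that is, deciding which object the human is referring to.

```python
import numpy as np

def gaussian_mixture_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at the grid points x."""
    z = (x[:, None] - means) / stds
    comps = np.exp(-0.5 * z ** 2) / (stds * np.sqrt(2.0 * np.pi))
    return comps @ weights

def softmax_likelihood(x, slopes, biases):
    """P(label | x): rows index grid points, columns index labels."""
    logits = x[:, None] * slopes + biases
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    expd = np.exp(logits)
    return expd / expd.sum(axis=1, keepdims=True)

grid = np.linspace(-10.0, 10.0, 2001)
dx = grid[1] - grid[0]

# Bimodal prior: the object is believed to be near -4 or near +4.
prior = gaussian_mixture_pdf(
    grid,
    weights=np.array([0.5, 0.5]),
    means=np.array([-4.0, 4.0]),
    stds=np.array([1.0, 1.0]),
)

# Hypothetical labels: 0 = "left of landmark", 1 = "near landmark", 2 = "right of landmark".
slopes = np.array([-1.0, 0.0, 1.0])
biases = np.array([0.0, 2.0, 0.0])
likelihood = softmax_likelihood(grid, slopes, biases)

# The human reports "right of landmark"; apply Bayes' rule on the grid.
observed_label = 2
posterior = prior * likelihood[:, observed_label]
posterior /= posterior.sum() * dx

mass_right = posterior[grid > 0].sum() * dx
print(f"posterior probability that the object is right of the landmark: {mass_right:.4f}")
```

The semantic report collapses the bimodal prior onto the mode consistent with the label. A report that is erroneous or refers to a different object would pull the belief toward the wrong mode, which is the failure case the abstract's association step is meant to handle.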
NASA releases new panoramic image of Mars to celebrate Curiosity rover's 9th anniversary
NASA has marked the Curiosity rover's ninth anniversary on Mars by unveiling a new panoramic image of the Martian landscape, a locale that may explain why the Red Planet became dry. The panoramic image, which was put together on July 3 by stitching 129 individual images together, shows Curiosity's current home, Mount Sharp, a 5-mile-tall mountain inside Mars' Gale Crater. The image was created by the rover's Mast Camera, or Mastcam. Since arriving at Mount Sharp in 2014, Curiosity has been traveling up the rock formation.
- North America > United States > California > San Diego County > San Diego (0.06)
- North America > United States > Florida > Brevard County > Cape Canaveral (0.05)
- North America > Canada > Northwest Territories > Yellowknife (0.05)
- (2 more...)
- Government > Space Agency (0.98)
- Government > Regional Government > North America Government > United States Government (0.96)
- Government > Military (0.71)
NASA's Curiosity Mars rover takes a selfie with the 20ft-tall 'Mont Mercou' rock formation
At first glance, you'd be forgiven for mistaking this image for a still from the latest science fiction blockbuster. But the photo is very much real, and was snapped by NASA's Curiosity Mars rover this week. The selfie shows the rover alongside a rock formation dubbed 'Mont Mercou', a nickname taken from a mountain in France. And while the photo is impressive on its own, it was actually taken to celebrate Curiosity's 30th sample to date, after the rover drilled a hole at a nearby rock sample nicknamed 'Nontron.' So far 2021 has been the 'year of Mars' with three spaceships from Earth arriving at the Red Planet.
- Europe > France (0.47)
- Asia > China (0.06)
- North America > United States > Florida > Brevard County > Cape Canaveral (0.05)
- (2 more...)
- Government > Regional Government > North America Government > United States Government (0.78)
- Government > Space Agency (0.67)
- Government > Military > Air Force (0.51)
Photographer captures highest resolution shots of snowflakes ever
A renowned photographer has captured the highest resolution shots of snowflakes ever using a homemade prototype described as one part microscope and one part camera. Nathan Myhrvold, an American scientist, inventor, photographer and ex-chief technology officer of Microsoft, took 18 months to build the 100 megapixel camera capable of capturing a snowflake's microscopic detail. Using the camera, which he describes as the 'highest resolution snowflake camera in the world', he took 100 frames of each snowflake in quick succession then stacked them for the whole image to be in focus. The results show the lush variety of snowflakes measuring only a few tens of millimetres in diameter, captured when Myhrvold was in Alaska and Canada. Pictured, stellar dendrite captured in Yellowknife, Canada.
- North America > Canada > Northwest Territories > Yellowknife (0.29)
- North America > Canada > Yukon > Whitehorse (0.06)
- North America > United States > Washington > King County > Bellevue (0.05)
- (2 more...)
- Media > Photography (0.52)
- Information Technology (0.35)